-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Reduce allocations in {Concat,SelectMany,Append,Prepend}.ToArray by close to 50% #14675
Conversation
array[arrayIndex + i] = enumerator.Current; | ||
} | ||
|
||
Debug.Assert(!enumerator.MoveNext(), $"Expected length of {nameof(source)} to be exactly {count}."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By using a for-loop we save 1 MoveNext
call in Release, since we know exactly when the sequence will end. e.g. For 0-length enumerables we would normally call MoveNext
once, but here we will skip the loop entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this requires that MoveNext()
will return true exactly count
times anyway, we could have a conventional foreach
, leave count
out of the call and ++arrayIndex
rather than ++i
and doing the addition.
Which of these micro-opts is the greater is hard to guess at from first principles rather than profiling though (and would depend on MoveNext()
complexity).
Losing the ability to test count
in debug would make some flaws harder to spot, though they should still be obvious enough to the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JonHanna While it is true that count
is only of debug value, I don't think think this function semantically makes sense without it. If you're saying it's safe to copy this enumerable to this array, then you would have to know the count of the enumerable (or at least an upper bound thereof) in advance. I doubt there will ever be a callsite for this function that could not provide a count.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have decided to remove this optimization and just go with the regular foreach
, however. I don't think calling side-effecting code like MoveNext
only in debug is a good idea, and who knows-- maybe some iterators/enumerators will not work the same if disposed too early.
/// <summary> | ||
/// The index within the buffer to select. | ||
/// </summary> | ||
internal int Column { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self-note: This doesn't really need to be a mutable struct. Maybe it would be better to make it immutable. (edit 1 mo. later: done)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe profile that. The effect either way can be surprising sometimes (at least I find so).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JonHanna 👍 I doubt it will make much of a difference since this is an 8-byte struct that fits in an x64 register, and it's only being copied once per CopyTo
call. Regardless, I will take your advice.
(edit 1 mo. later: I have had trouble building the repo recently and so am unable to run perf tests. I very much doubt making this struct immutable will have a noticeable impact because an extra copy is only made once per method call. Besides, the struct is only 8 bytes.)
/// </summary> | ||
public T[] ToArray() | ||
{ | ||
if (_count == _first.Length) | ||
T[] array; | ||
if (TryMove(out array)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a bit of cleanup-- added a TryMove
function that attempts to get this builder as a single array without copying.
++index; | ||
} | ||
} | ||
EnumerableHelpers.Copy(_source, array, index, count - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some cleanup-- I stumbled across this snippet of code, which looks like it has the exact same semantics as the Copy
function I've introduced.
{ | ||
builder.Add(item); | ||
} | ||
builder.Reserve(_appended.Count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general pattern is
var builder = new SparseArrayBuilder<T>(initialize: true);
...
if (CanPredetermineSizeOfSomeItems)
{
// Semantically equivalent to builder.AddRange(
// Enumerable.Repeat(default(T), PredeterminedSize)),
// except we don't add anything to the buffer.
builder.Reserve(PredeterminedSize);
}
...
// Creates an array with a bunch of zero-initialized
// gaps in it where `Reserve` has been called
T[] array = builder.ToArray();
// Get the list of locations where gaps were added.
ArrayBuilder<Marker> markers = builder.Markers;
for (int i = 0; i < markers.Count; i++)
{
// Each Marker tells us the index of the gap in
// the array, as well as how wide the gap is.
Marker marker = markers[i];
// Copy the corresponding collection/nodes/etc.
// to this gap
...
}
This is not the best example because we don't have to use the markers (we know exactly where the gaps are here, at the start & at the end), but you can see the above code structure in Concat
& SelectMany
.
// remove this workaround. | ||
Debug.Assert(initialize); | ||
|
||
_builder = new LargeArrayBuilder<T>(initialize: true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't looked into it extensively yet, but I'm conjecturing that some of the speed-wise regressions in the tests for small collections could be abated if we could switch to the : this()
-style pattern here and in LargeArrayBuilder
. The JIT does not handle large structs very well yet, and additionally will not be able to inline either of these methods. (edit: Made the changes)
It certainly seems soundly reasoned. I think passing the |
Test Innerloop Ubuntu14.04 Debug Build and Test |
I posted some benchmarks for |
@VSadov @OmarTawfik can you please review the change when you get a chance? |
LGTM. @VSadov? |
e59cae1
to
f838d32
Compare
e193592
to
c410a22
Compare
@VSadov PTAL when you have time-- I have tidied up this PR and posted perf results at #14675 (comment) |
@VSadov @OmarTawfik ping? |
LGTM |
Reduce allocations in {Concat,SelectMany,Append,Prepend}.ToArray by close to 50% Commit migrated from dotnet/corefx@a75d133
Background: Some
ToArray
implementations employ a double-buffering strategy if we don't know the exact length of the output array. First the items are copied into some sort of buffer, then they are transferred into an array. For example, in the following snippetIn
ToArray
, we know the size of the lhs enumerable but not how many elements are in the rhs one. So we allocate a buffer with 100+ elements to hold everything, calculate the final size of the array, and re-copy those elements.Description: Instead of copying everything twice, we can buffer only the enumerables we can't get the count of. If we reach an
ICollection
or anIIListProvider
and it has N items, we queue it to be copied later, and tell the builder to leave a gap of N items in the output array. Finally, whenToArray
is called, we revisit all of those queued collections and copy them directly to the output.Here is how the above example works now:
Performance results: An initial test is showing almost 50% decrease in GCs in
SelectMany(e => e).ToArray()
, assuminge
is always anICollection
. results / testMethods this affects:
Concat().ToArray()
,SelectMany().ToArray()
,Append/Prepend().ToArray()
I had to pull in the changes from #13942 because this affects
SelectMany
, so the LOC in the diffs is exaggerated. Once/if that's merged, I'll rebase.cc @stephentoub @VSadov @JonHanna